A stroll along the gamma
We provide the first in-depth study of the "smart path" interpolation between
an arbitrary probability measure and the gamma
distribution. We propose new explicit representation formulae for the ensuing
process as well as a new notion of relative Fisher information with a gamma
target distribution. We use these results to prove a differential and an
integrated De Bruijn identity which hold under minimal conditions, thereby
extending the classical formulae which follow from Bakry, Émery and Ledoux's
Γ-calculus. Exploiting a specific representation of the "smart path", we
obtain a new proof of the logarithmic Sobolev inequality for the gamma law with
as well as a new type of HSI inequality linking relative
entropy, Stein discrepancy and standardized Fisher information for the gamma
law with .
Stein-type covariance identities: Klaassen, Papathanasiou and Olkin-Shepp type bounds for arbitrary target distributions
In this paper, we present a minimal formalism for Stein operators which leads
to different probabilistic representations of solutions to Stein equations.
These in turn provide a wide family of Stein-Covariance identities which we put
to use for revisiting the very classical topic of bounding the variance of
functionals of random variables. Applying the Cauchy-Schwarz inequality yields
first order upper and lower Klaassen-type variance bounds. A probabilistic
representation of Lagrange's identity (i.e. Cauchy-Schwarz with remainder)
leads to Papathanasiou-type variance expansions of arbitrary order. A matrix
Cauchy-Schwarz inequality leads to Olkin-Shepp type covariance bounds. All
results hold for univariate target distributions under very weak assumptions (in
particular they hold for continuous and discrete distributions alike). Many
concrete illustrations are provided.
On the rate of convergence in de Finetti's representation theorem
A consequence of de Finetti's representation theorem is that for every
infinite sequence of exchangeable 0-1 random variables (X_k)_{k≥1}, there
exists a probability measure μ on the Borel sets of [0,1] such that the sample
mean X̄_n = (X_1 + ... + X_n)/n converges weakly to μ. For a wide class of
probability measures μ having smooth density on [0,1], we give bounds of
order with explicit constants for the Wasserstein distance between the
law of X̄_n and μ. This extends a recent result by Goldstein and
Reinert (2013) regarding the distance between the scaled
number of white balls drawn in a Pólya-Eggenberger urn and its limiting
distribution. We also prove that, in the most general cases, the distance
between the law of the sample mean and its weak limit is bounded below by and above by
(up to some multiplicative constants). For every , we give an example of an exchangeable sequence such that this
distance is of order
Distances between nested densities and a measure of the impact of the prior in Bayesian statistics
In this paper we propose tight upper and lower bounds for the Wasserstein
distance between any two univariate continuous distributions with
probability densities p_1 and p_2 having nested supports. These explicit
bounds are expressed in terms of the derivative of the likelihood ratio
p_2/p_1 as well as the Stein kernel of p_1. The method of proof
relies on a new variant of Stein's method which manipulates Stein operators.
We give several applications of these bounds. Our main application is in
Bayesian statistics: we derive explicit data-driven bounds on the Wasserstein
distance between the posterior distribution based on a given prior and the
no-prior posterior based uniquely on the sampling distribution. This is the
first finite sample result confirming the well-known fact that with
well-identified parameters and large sample sizes, reasonable choices of prior
distributions will have only minor effects on posterior inferences if the data
are benign
The Adaptive Sampling Revisited
The problem of estimating the number n of distinct keys of a large
collection of data is well known in computer science. A classical algorithm
is adaptive sampling (AS). Here n can be estimated by R·2^D, where R is
the final bucket (cache) size and D is the final depth at the end of the
process. Several new interesting questions can be asked about AS (some of them
were suggested by P. Flajolet and popularized by J. Lumbroso). The distribution
of is known; we rederive this distribution in a simpler way.
We provide new results on the moments of and . We also analyze the final
cache size distribution. We consider colored keys: assume that among the
distinct keys, do have color . We show how to estimate
. We also study colored keys with some multiplicity given by
some distribution function. We want to estimate the mean and variance of this
distribution. Finally, we consider the case where neither colors nor
multiplicities are known. There we want to estimate the related parameters. An
appendix is devoted to the case where the hashing function provides bits with
probability different from 1/2.
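The estimator described in the abstract above can be illustrated with a minimal sketch. It assumes the standard form of adaptive sampling: keep only keys whose hash has its D low-order bits equal to zero, increase D whenever the cache overflows, and report R·2^D at the end. The function name `adaptive_sampling_estimate`, the `capacity` parameter, and the use of SHA-256 as the hash are illustrative assumptions, not taken from the paper.

```python
import hashlib


def adaptive_sampling_estimate(keys, capacity=64):
    """Sketch of adaptive sampling (names and hash choice are assumptions).

    Returns (estimate, R, D), where R is the final cache size, D the final
    depth, and estimate = R * 2**D.
    """
    depth = 0
    cache = set()  # hashes of the distinct keys currently sampled
    for key in keys:
        digest = hashlib.sha256(str(key).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")  # 64-bit hash of the key
        # A key is sampled only if its `depth` low-order hash bits are zero.
        if h & ((1 << depth) - 1) == 0:
            cache.add(h)
            # On overflow, go one level deeper and keep roughly half the cache.
            while len(cache) > capacity:
                depth += 1
                mask = (1 << depth) - 1
                cache = {x for x in cache if x & mask == 0}
    return len(cache) * 2 ** depth, len(cache), depth


if __name__ == "__main__":
    # 10,000 distinct keys, each seen twice; duplicates do not change the result.
    data = list(range(10_000)) * 2
    estimate, r, d = adaptive_sampling_estimate(data, capacity=256)
    print(f"R = {r}, D = {d}, estimate = {estimate}")
```

Because the cache stores hashes in a set, repeated keys leave the state unchanged, so the estimate depends only on the set of distinct keys. A larger `capacity` costs more memory and reduces the variance of the estimate.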
On Hodges and Lehmann's "6/π result"
While the asymptotic relative efficiency (ARE) of Wilcoxon rank-based tests
for location and regression with respect to their parametric Student
competitors can be arbitrarily large, Hodges and Lehmann (1961) have shown that
the ARE of the same Wilcoxon tests with respect to their van der Waerden or
normal-score counterparts is bounded from above by 6/π. In
this paper, we revisit that result, and investigate similar bounds for
statistics based on Student scores. We also consider the serial version of this
ARE. More precisely, we study the ARE, under various densities, of the
Spearman-Wald-Wolfowitz and Kendall rank-based autocorrelations with respect to
the van der Waerden or normal-score ones used to test (ARMA) serial dependence
alternatives